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PROGRAM INSTRUMENTATION METHOD AND APPARATUS FOR 
CONSTRAINING THE BEHAVIOR OF EMBEDDED SCRIPT IN 
DOCUMENTS 



PRIORITY 

[0001] The present patent application claims priority to and incorporates by 

reference the corresponding provisional patent application serial no. 60/816,679, 
entitled, "Program Instrumentation Method and Apparatus for Constraining the 
Behavior of Embedded Script in HTML Documents", filed on June 26, 2006. 



FIELD OF THE INVENTION 

[0002] The present invention relates to the field of computer programming; 

more particularly, the present invention relates to controlling (e.g., constraining) the 
behavior of embedded script in documents (e.g., HTML documents). 



BACKGROUND OF THE INVENTION 

[0003] JavaScript has become a popular tool in building web pages. JavaScript 

programs are essentially a form of mobile code embedded in HTML documents and 
executed on client machines. With help of the Document Object Model (DOM) and 
other browser features, JavaScript programs can obtain restricted access to the client 
system and improve the functionality and appearance of web pages. 
[0004] As is the case of other forms of mobile code, JavaScript programs 

introduce potential security vulnerabilities and loopholes for malicious parties to 
exploit. As a simple example, JavaScript is often used to open a new window on the 
client. This feature provides a degree of control beyond that offered by plain HTML 
alone, allowing the new window to have customized size, position, and components 
(e.g., menu, toolbar, status bar). Unfortunately, this feature has been heavily exploited 
to generate annoying pop-ups of undesirable contents, some of which are difficult to 
"control" from a web user's point of view (e.g., control buttons out of screen 
boundary, instant respawning when closed). More severely, this feature has also been 
exploited for launching phishing attacks, where key information about the origin of the 
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web page is hidden from users (e.g., a hidden location bar), and false information 
assembled to trick users into believing malicious contents (e.g., a fake location bar). 
[0005] As another example, JavaScript is often used to store and retrieve useful 

information (e.g., a password to a web service) on the client machine as a "cookie." 
Such information is sometimes sensitive, and therefore the browser restricts the access 
to cookies based on the origin of web pages. For instance, JavaScript code from 
attacker . com will not be able to read a cookie set by mybank . com. Unfortunately, 
many web applications exhibit XSS vulnerabilities, where a malicious piece of script 
can be injected into a web page produced by a vulnerable application. The browser 
interprets the injected script as if it was intended by the same application. As a result, 
the browser's origin-based protection is circumvented, and the malicious script may 
obtain access to the cookie set by the vulnerable application. 

[0006] Thus, in general, JavaScript has been exploited to launch a wide range 

of attacks. The situation is potentially worse than for other forms of mobile code such 
as application downloading, because the user may not realize that loading web pages 
entails the execution of untrusted code. 

[0007] JavaScript, DOM, and web browsers provide some basic security 

protections. Among the commonly used are sandboxing, same-origin policy, and 
signed scripting. These only provide limited (coarse-grained) protections. There 
remain many opportunities for attacks, even if these protections are perfectly 
implemented. Representative example attacks that are not prevented by these include 
XSS, phishing, and resource abuse. 

[0008] There have been also some separate browser security tools developed, 

such as pop-up blockers and SpoofGuard. These separate solutions only provide 
protection against specific categories of attacks. In practice, it is sometimes difficult to 
deploy multiple solutions all together. In addition, there are many attacks that are 
outside of the range of protection of existing tools. Nonetheless, ideas and heuristics 
used in these tools are likely to be helpful for constructing useful security policies for 
instrumentation. 

[0009] Some schemes in the context of client-side protection against malicious 

mobile code, including those written in JavaScript, have been proposed. They provide 
only coarse-grained protection by conducting some checks on the security profile of 
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downloaded code (e.g., based on known hostile downloadables, trusted and untrusted 
URLs, and suspicious code patterns), and by preventing the execution of the code 
when the checks fail. Some also scan the content of the code for potential exploits 
based on a set of rules. None of these prior protection methods rewrite the code. The 
protection provided by an embodiment of the present invention is more fine-grained, 
because the method inspects the code for its behaviors and rewrites the code to respect 
the policy. 

[0010] All of the above mentioned protection mechanisms are deployed on the 

client side. Server-side protection has also been studied, especially in the context of 
command injection attacks. They help well-intended programmers to build web 
applications that are free of certain vulnerabilities, but cannot prevent malicious code 
from harming the client through browser-based attacks. 

[0011] Existing academic work on formalizing JavaScript focuses on helping 

programmers write good code, as opposed to thwarting malicious exploits. On the 
technical aspects, they treat JavaScript programs using the conventional program 
execution model, rather than as separate fragments embedded in HTML documents. 
They have not addressed higher-order script, which is a form of JavaScript code not 
directly available statically, but rather generated dynamically during JavaScript 
program execution. 

[0012] Program instrumentation is well-known. However, these previous 

techniques address specific questions including memory safety, debugging and testing, 
and data collection. They do not address browser safety and security questions. In 
addition, they are not sufficient for regulating the behavior of embedded JavaScript in 
HTML documents. 

SUMMARY OF THE INVENTION 

[0013] A method and apparatus is disclosed herein for constraining the 

behavior of embedded script in documents using program instrumentation. In one 
embodiment, the method comprises downloading a document with a script program 
embedded therein, inspecting the script program, and rewriting the script program to 
cause behavior resulting from execution of the script to conform to one or more 
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policies defining safety and security. The script program comprises self-modifying 
code (e.g., dynamically generated script). 



BRIEF DESCRIPTION OF THE DRAWINGS 

[00141 The present invention will be understood more fully from the detailed 

description given below and from the accompanying drawings of various embodiments 

of the invention, which, however, should not be taken to limit the invention to the 

specific embodiments, but are for explanation and understanding only. 

[0015] Figure 1 illustrates an example of script embedded in an HTML 

document; 

[0016] Figure 2 illustrates an example of an execution order of higher-order 

script; 

[0017] Figure 3A is a flow diagram of one embodiment of a process for 

instrumenting script programs embedded in documents; 

[0018] Figure 3B illustrates one embodiment of a process for performing 

instrumentation of higher-order script; 

[0019] Figure 4 illustrates one embodiment of CoreScript syntax; 

[0020] Figure 5 illustrates expression and action evaluation in one 

embodiment of CoreScript; 

[0021] Figure 6 illustrates world execution in one embodiment of CoreScript; 

[0022] Figure 7 illustrates helper functions of CoreScript semantics in one 

embodiment of CoreScript; 

[0023] Figure 8 illustrates policy satisfaction and action editing for use with 

one embodiment of CoreScript; 

[0024] Figure 9 illustrates one embodiment of edit automata; 

[0025] Figure 10 illustrates an example of automaton for a pop-up policy; 

[0026] Figure 11 illustrates an example of automaton for a cookie policy; 

[0027] Figure 12 illustrates one embodiment of syntax -directed rewriting; 

[0028] Figure 13 illustrates world execution in one embodiment of CoreScript 

extended with the policy module; 

[0029] Figure 14 illustrates an example implementation architecture; and 

[0030] Figure 15 is a block diagram of an example of a computer system. 
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DETAILED DESCRIPTION OF THE PRESENT INVENTION 
[0031] A method and apparatus for using program instrumentation to constrain 

behavior of embedded script in documents is described. In one embodiment, the 
documents comprise HTML documents. Embodiments of the present invention are 
different from existing program instrumentation in several aspects, including, but not 
limited to the handling of the specific execution model of JavaScript, the self- 
modifying capability of embedded script, and some pragmatic policy issues. 
[0032] In one embodiment, inserted security checks and dialogue warnings 

using program instrumentation are used to identify and reveal to users potentially 
malicious behaviors. The extra computation overhead is usually acceptable, because 
JavaScript programs are typically very small in size, and performance is not a major 
concern of most web pages. 

[0033] In the following description, numerous details are set forth to provide a 

more thorough explanation of the present invention. It will be apparent, however, to 
one skilled in the art, that the present invention may be practiced without these specific 
details. In other instances, well-known structures and devices are shown in block 
diagram form, rather than in detail, in order to avoid obscuring the present invention. 
[0034] Some portions of the detailed descriptions which follow are presented 

in terms of algorithms and symbolic representations of operations on data bits within a 
computer memory. These algorithmic descriptions and representations are the means 
used by those skilled in the data processing arts to most effectively convey the 
substance of their work to others skilled in the art. An algorithm is here, and 
generally, conceived to be a self-consistent sequence of steps leading to a desired 
result. The steps are those requiring physical manipulations of physical quantities. 
Usually, though not necessarily, these quantities take the form of electrical or magnetic 
signals capable of being stored, transferred, combined, compared, and otherwise 
manipulated. It has proven convenient at times, principally for reasons of common 
usage, to refer to these signals as bits, values, elements, symbols, characters, terms, 
numbers, or the like. 

[0035] It should be borne in mind, however, that all of these and similar terms 

are to be associated with the appropriate physical quantities and are merely convenient 
labels applied to these quantities. Unless specifically stated otherwise as apparent 
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from the following discussion, it is appreciated that throughout the description, 
discussions utilizing terms such as "processing" or "computing" or "calculating" or 
"determining" or "displaying" or the like, refer to the action and processes of a 
computer system, or similar electronic computing device, that manipulates and 
transforms data represented as physical (electronic) quantities within the computer 
system's registers and memories into other data similarly represented as physical 
quantities within the computer system memories or registers or other such information 
storage, transmission or display devices. 

[0036| The present invention also relates to apparatus for performing the 

operations herein. This apparatus may be specially constructed for the required 
purposes, or it may comprise a general purpose computer selectively activated or 
reconfigured by a computer program stored in the computer. Such a computer 
program may be stored in a computer readable storage medium, such as, but is not 
limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and 
magnetic-optical disks, read-only memories (ROMs), random access memories 
(RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media 
suitable for storing electronic instructions, and each coupled to a computer system bus. 
[0037| The algorithms and displays presented herein are not inherently related 

to any particular computer or other apparatus. Various general purpose systems may 
be used with programs in accordance with the teachings herein, or it may prove 
convenient to construct more specialized apparatus to perform the required method 
steps. The required structure for a variety of these systems will appear from the 
description below. In addition, the present invention is not described with reference to 
any particular programming language. It will be appreciated that a variety of 
programming languages may be used to implement the teachings of the invention as 
described herein. 

[0038] A machine-readable medium includes any mechanism for storing or 

transmitting information in a form readable by a machine (e.g., a computer). For 
example, a machine-readable medium includes read only memory ("ROM"); random 
access memory ("RAM"); magnetic disk storage media; optical storage media; flash 
memory devices; electrical, optical, acoustical or other form of propagated signals 
(e.g., carrier waves, infrared signals, digital signals, etc.); etc. 
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Overview of JavaScript Execution Model 

[0039J The execution model of JavaScript is quite different from those of other 

programming languages. A typical programming language takes input and produces 
output, possibly with some side effects produced during the execution. In the case of 
JavaScript, the program itself is embedded inside the "output" being computed. Figure 
1 shows an example of a piece of script embedded in an HTML document. This script 
uses a tunction parseName (definition not shown) to obtain a user name from the 
cookie (document . cookie), and then calls a DOM API document . write to update 
the script node with some text. 

[0040] Techniques described herein reflect this execution model. In one 

embodiment, script is handled as embedded in some tree-structured documents 
corresponding to HTML documents. In the operational semantics, script pieces are 
interpreted and replaced with the produced document pieces. The interpretation stops 
when no active script is present in the entire document. In essence, this execution 
model is a particular embodiment of self-modifying code. 

Overview of Program Instrumentation 

[0041] Figure 3A is a flow diagram of one embodiment of a process for 

instrumenting script programs embedded in documents. The process is performed by 
processing logic that may comprise hardware (e.g., circuitry, dedicated logic), software 
(such as is run on a general purpose computer system or dedicated machine), or a 
combination of both. 

[0042] Referring to Figure 3A, the process begins by processing logic 

downloading a document with a script program embedded therein (processing block 
301). In one embodiment, downloading the document is performed by a browser in a 
client. In one embodiment, the document is an HTML document. In one embodiment, 
the script program is JavaScript. In another embodiment, the script program is an 
ECMAscript-based program. In one embodiment, the script program comprises self- 
modifying code. The self-modifying code may comprise dynamically-generated 
JavaScript. 
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[0043] After downloading the document, processing logic inspects the script 

program in the document (processing block 302). 

[0044] Based on the results of the inspection, processing logic rewrites the 

script program to cause behavior resulting from execution of the script to conform to 
one or more policies defining safety and security (processing block 303). In one 
embodiment, rewriting the script program comprises performing syntax-directed 
rewriting on demand. 

[0045] In one embodiment, rewriting the script program comprises inserting a 

run-time check into the script program. In one embodiment, the run-time check 
comprises a security check. In another embodiment, the run-time check comprises a 
user warning. In one embodiment, the run-time check comprises code, which when 
executed at run-time, causes a call to an instrumentation process to instrument the 
script program in the document in response to performing an evaluation operation of 
the document. 

[0046] In one embodiment, rewriting the script program comprises adding 

code to redirect an action through a policy module during run-time execution. In one 
embodiment, the process optionally includes the policy module performing a 
replacement action for the action at run-time where code has been added to redirect an 
action through the policy module. 

[0047] In one embodiment, rewriting the script program comprises parsing the 

document into abstract syntax trees, performing rewriting on the abstract syntax trees, 
and generating instrumented script code and an instrumented document from the 
abstract syntax trees. 

[0048] In one embodiment, the one or more policies are dynamically 

modifiable. In one embodiment, the one or more policies are expressed as edit 
automata. In one embodiment, at least one policy transfers a first action sequence in 
script program to a second action sequence different than the first action sequence. In 
one embodiment, the policies include a policy that combines multiple policies into 
one. 

[0049] In one embodiment, the process further comprises maintaining states 

relevant to an edit automaton of the policy, including a current state and a complete 
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transition function and calling a check function on an action and, in response thereto, 
advancing the state of the automaton and provides a replacement action for the action. 

Higher-order Script 

[0050] In the document pieces produced by JavaScript code at "run-time", 

there could be further JavaScript code embedded therein. This gives rise to a form of 
self-modifying code which is referred to herein as higher-order script (as in script that 
generates other script). For example, the above DOM API document . write allows 
not only plain-text arguments, but also arbitrary HTML-document arguments that 
possibly contain script nodes. These "run-time"-generated script nodes, when 
interpreted, may in turn produce more "run-time"-generated script nodes. In fact, 
infinite rounds of script generation could be programmed. 
[0051] The behavior of an HTML document with higher-order script is 

sometimes difficult to understand. For instance, the two pieces of code fragment in 
Figure 2 appear to be similar, but produce different results. In this example, there is 
implicit conversion from integers to strings, + is string concatenation, and the closing 
tag </script> of the embedded script is intentionally separated so that the parser will 
not misunderstand it as for closing the outer script fragment. The evaluation and 
execution order of higher-order script is not clearly specified in the language 
specification. It is therefore useful to have a rigorous account as provided in the 
operational semantics described herein. 

[0052] More importantly, higher-order script can be exploited to circumvent 

existing program instrumentation, because the instrumentation process cannot 
effectively identify all relevant operations before the execution of the code — some 
operations may be embedded in string arguments which are not apparent until at "run- 
time", including computation results, user input, and documents loaded from URLs. In 
addition, there are many ways to obfuscate such embedded script against analyses and 
filters, e.g., by a special encoding of some tag characters. 
[0053] In one embodiment, the rewritten script program is part of an 

instrumented version of the document, and the instrumented version of the document 
includes hidden script, which is rewritten when the hidden script is generated during 
run-time. In one embodiment, the instrumentation of higher-order script is handled 
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through an extra level of indirection, as demonstrated in Figure 3B. During the 
instrumentation, explicit security events such as load(url) are directly rewritten with 
code that performs pertinent security checks and user warnings (abstracted by safe- 
load(url)). However, further events may be hidden in the generated script doc. 
Without inspecting the content of doc, which is a hardship statically, the doc is fed 
verbatim to some special code instr. The special code, when executed at "run-time", 
calls back to the instrumentation process to perform the necessary inspection on the 
evaluation result of doc. 

[0054] Such a treatment essentially delays some of the instrumentation tasks 

until "run-time", making it happen on demand. Note the code and other components 
of the instrumentation can be implemented in alternative embodiments either by 
changing the JavaScript interpreter in the browser or by using carefully written (but 
regular) JavaScript code which cannot be circumvented or misused. 
(0055] In the case of JavaScript programs, it is sometimes difficult to exactly 

characterize violations. For instance, suppose a web page tries to load a document 
from a different domain. This may be either an expected redirection or a symptom of 
XSS attacks. In such a situation, it is usually desirable to present pertinent information 
to the user for their discretion. 

[0056] In one embodiment, script is modified to prompt the user about 

suspicious behaviors, as opposed to stopping the execution right away. When 
appropriate (e.g., for out-of- boundary windows), the semantics of the code are 
changed (e.g., by "wrapping" the position arguments). In one embodiment, edit 
automata are used (which are more expressive than security automata) to represent 
such policies, and an instrumentation method described herein is designed to enforce 
policies in the form of edit automata. 

[0057] Due to the wide use of JavaScript, it is difficult to provide a fixed set of 

policies for all situations. Customized policies are much desirable — there should be no 
change to the rewriting mechanisms when accommodating a new policy. In one 
embodiment, the same kind of rewriting regardless of the specifics of the policies is 
performed. In one embodiment, the rewriting produces code that refers to the policy 
through a fixed policy interface. 
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[0058] Furthermore, JavaScript and browser security is a mixture of many 

loosely coupled questions. Therefore, a useful policy is typically a combination of 
multiple component policies. 

[0059] In one embodiment, similar to the case of the special code for handling 

higher-order script, the implementation of policy management is performed by either 
changing the JavaScript interpreter or by using regular JavaScript code. In the latter 
case, special care may be given to ensure that the implementation cannot be misused or 
circumvented. 

CoreScript 

[0060] In one embodiment, an abstract language CoreScript is used to 

constrain the behavior of embedded script programs in documents. CoreScript is a 
model of abstract version of JavaScript. One embodiment of an implementation of 
CoreScript is given below. 

[0061] More specifically, CoreScript provides a way to describe the details of 

the instrumentation method below. In particular, operational semantics are given to 
CoreScript, focusing on higher-order script and its embedding in documents. To avoid 
obscuring the invention, objects are omitted from this model, because they are 
orthogonal. Note that adding objects present no new difficulties for instrumentation. 
[0062] One embodiment of the syntax of CoreScript is shown in Figure 4. 

Referring to Figure 4, the symbols [ ] are used as parentheses of the meta language, 
rather than as part of the CoreScript syntax. 

[0063] In one embodiment, the syntax is shown as follows. A complete 

"world" W is a 4-tuple (S, %,B,C). 

[0064] The first element L, a document bank, is a mapping from URLs / to 

documents D. In one embodiment, the document bank corresponds to the Internet. 
The second element & a variable bank, maps global variables x to documents D, and 
functions/to script programs P with formal parameters x . The third element B, a 
browser, consists of possibly multiple windows, and each window has a handle h for 
ease of referencing, a document D as the content, and a domain name d marking the 
origin of the document. The fourth element C, a cookie bank, maps domain names to 
cookies in the form of documents (each domain has its own cookie). In one 
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embodiment, strings are used to model domain names d, paths p, and handles h. A 
URL / is a pair of a domain name d and a path p. An implicit conversion between 
URLs and strings is assumed (which is well-known in the art and done implicitly so as 
not to obscure the present invention). 

[0065] In one embodiment, documents D correspond to HTML documents. In 

JavaScript, all kinds of documents are embedded as strings using HTML tags such as 
<script> and <em>. That is, documents are treated uniformly as strings by program 
constructs, but are parsed differently than plain strings when interpreted. Documents 
in CoreScript reflect this, except that different kinds of documents are made 
syntactically different, rendering the parsing implicit. In one embodiment, a document 
is in either one of three syntactic forms: a plain string (string), a piece of script (js P), 
or a formatted document made up of a vector of sub-documents (FD ). Value 
documents D v are documents that contain no script. A few common HTML format 
tags in the syntax are listed as F, and a new tag jux is introduced to represent the 
juxtaposition of multiple documents (this simplifies the presentation of the semantics). 
[0066] In one embodiment, the script programs P are mostly made up of 

common control constructs, including no-op, assignment, sequencing, conditional, 
while-loop, and function call. In addition, in one embodiment, actions act(A) are 
security-relevant operations that are to be identified and rewritten by the 
instrumentation process described herein. Furthermore, higher-order script is 
supported using write(£), where E evaluates at "run-time" to a document that may 
contain additional script. 

[0067] Expressions E include variables x, documents D, and other operations 

op( E ). In one embodiment, the abstract op construct is used to cover common 
operations which are free of side-effects, such as string concatenation and comparison. 
In one embodiment, booleans are not explicitly modeled and instead they are simulated 
with special documents (strings) false and true. 

[0068] A few actions A are modeled explicitly for demonstration purposes. 

The action newWin(.x,£) creates a new window with E as the content document; a fresh 
handle is assigned to the new window and stored in x. The action closeWin(£) closes 
the window which has handle E. The action loadURLCE) directs the current window to 
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load a new document from the URL E. The action readCki(x) reads the cookie of the 
current domain into x. The action writeCki(is) writes E into the cookie of the current 
domain. All other potential actions may be abstracted as a generic secOp(.E ). Value 
actions A v are actions with document arguments only. Some arguments to actions are 
variables for storing results such as window handles or cookie contents. Such 
arguments are replaced with the notation "_" in value actions, because they do not 
affect the instrumentation. 

10069] Figure 5 illustrates one embodiment of the semantics of expressions in a 

big-step style. At "run-time", expressions evaluate to documents, but not necessarily 
"value documents." Referring to Figure 5, there are a number of rules shown. As 
Rule (2) shows, D is not inspected for embedded script during expression evaluation. 
In Rule (3), op is used to refer to the corresponding meta-level computation of op. 
[0070] Actions are evaluated to value actions as shown in Figure 5. In 

particular, Rule (9) indicates that a cookie may be written with any document D, 
including a document with script embedded. Therefore, a program may store 
embedded script in a cookie for later use. The instrumentation given below will be 
sound under this behavior. 

[0071] Figure 6 illustrates one embodiment of the execution of a world in a 

small-step style. This is more intuitive when considering the security actions 
performed along with the execution, as well as their instrumentation. 
[0072] Referring to Figure 6, rules (11) and (12) define a multi-step relation 

that describes how a world evolves during execution. It is defined as the reflexive and 
transitive closure of a single step relation. The single step relation, defined by Rule 
(13), describes how a world evolves in a single step. It is non-deterministic, reflecting 
that any window could advance its content document at any time. Finally, Rule (14) 
uniformly advances the document in the window of handle h, using some macros 
defined in Figure 7. 

[0073] The macro focus identifies the focus of the execution. It traverses a 

document and locates the left-most script component. The macro stepDoc computes 
an appropriate document for the next step, assuming that the focus of the argument 
document will be executed. The focus and stepDoc cases on value documents (e.g., 
strings) are undefined. This indicates that nothing in value documents can be 

13 



WO 2008/002456 



PCT/US2007/014506 



executed. If nothing can be executed in the entire document, then the execution 
terminates. 

10074] The macro step computes the step transition on worlds. Suppose the 

world W\s making a step transition by advancing the document in window h, and 
suppose the focus computation of the document in window h is P. The result world 
after the step transition would be step(P, h, W). When defining step, the helper adv{B, 
h, x) makes use of stepDoc to advance the document in window h. 
[0075] Note that the semantics dictates the evaluation order for higher-order 

script, thus the two examples in Figure 2 are naturally explained. Take write(o/?( E )); 
P as an example. CoreScript evaluates all E,- before executing the script embedded in 
any of them, explaining the behavior of the first script fragment in Figure 2. P is 
executed after the script generated by write<op( E )) has finished execution, 
explaining the second. 

Security Policies 

[0076] Various security policies can be designed to counter browser-based 

attacks. For instance, these attacks may include the opening of an unlimited number of 
windows (e.g., pop-ups) and send sensitive cookie information to untrusted parties 
(e.g., XSS). 

[0077] In one embodiment, policy management and code rewriting are 

performed by two separate modules. In this way, policies can be designed without 
knowledge of the rewriting process. A policy designer ensures that the policies 
adequately reflect the desired protections. On the enforcement side, the rewriting 
process accesses the policy through a policy interface. The same kind of rewriting is 
used for all policies. 

[0078] One embodiment of a policy framework and a policy interface that 

embodiments of the rewriting method use is given below. 

Policy Representation 

[0079] In one embodiment, policies n are expressed as edit automata. In one 

embodiment, an edit automaton is a triple (£>, g 0 , 8), where Q is a possibly countably 
infinite set of states, qo^ Q is the initial state (or current state), and 8 is the complete 
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transition function that has the form S: Q *A —*■ Q *A (the symbol A is reused here to 
denote the set of actions in CoreScript). The transition function 8 may specify 
insertion, replacement, and suppression of actions, where suppression is handled by 
discarding the input action and producing an output action of e. In one embodiment, 8 
(1j e) = (q, e) so that policies are deterministic. 

[0080] Figure 8 defines the meaning of a policy for one embodiment in terms 

of policy satisfaction (whether an action sequence is allowed) and action editing (how 
to rewrite an action sequence). Referring to Figure 8, rules (15) and (16) define the 
satisfaction of a policy n on an action sequence A . Intuitively, n \-A if and only if 
when feeding A into the automaton of II, the automaton performs no modification to 
the actions, and stops at the end of the action sequence in a state that signals 
acceptance. Below, it is assumed that every state is an "accept" state for simplicity, 
although it is trivial to relax this assumption. 

[0081] Rules (17) and (1 8) define how a policy n transforms an action 

sequence A into another action sequence A '. Intuitively, II \-A => A ' if and only if 
when feeding A into the automaton of II, the automaton produces A '. 
[0082] Because not all edit automata represent sensible policies, it is useful to 

define the consistency of policies. For instance, an edit automaton may convert action 
A\ into A 2 and A 2 into A \. It is unclear how an instrumentation mechanism should act 
under this policy, because even the recommended replacement action does not satisfy 
the policy. Thus, consistency is used to control edit automata. 

Definition 1 (Policy Consistency) A policy II = (Q, q 0 , 5) is consistent if and only if 

8 (q, A) = (q',A r ) implies 8 (q, A 1 ) = {q\ A') for any q, q',A and ,4'. 
Theorem 1 (Sound Advice) Suppose II is consistent. If n \-A => A ', then II \-A '. 

[0083] An inconsistent policy reflects an error in policy design. Syntactically, 

in one embodiment, an inconsistent policy is converted into a consistent one: when the 
policy suggests a replacement action A' for an input action^ under state q, the policy is 
updated to also accept action A ' under state q. More accurately, if 5 (q, A) = (q',A % 
then consistency may be achieved by ensuring that 8 (q ', A 0 = {q ', A •). However, 
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semantically, the policy designer decides whether the updated policy is the intended 
one, especially in the cases of conflicting updates. For instance, in the above example, 
the inconsistent policy may have already defined 5 (q, A *) = (q'\ A "). 
[0084] In one embodiment, consistent policies are used to guide the 

instrumentation described herein, and policy consistency serves as an assumption of 
the correctness theorems discussed herein. Internally, a policy module maintains all 
states relevant to the edit automaton of the policy, including a current state and a 
complete transition function. Externally, the same policy module exposes the 
following interface to the rewriting process: 
• Action review: check^). 

This action review interface takes an input action as argument, advances the internal 
state of the automaton, and performs a replacement action according to the transition 
function. 

[0085] The policy framework described above is effective in identifying 

realistic JavaScript attacks and providing useful feedback to the user. Examples are 
given below that demonstrate the identification of script attacks and the providing of 
useful feedback. 

[0086] For ease of reading, Figure 9 presents edit automata as diagrams. To 

build a diagram from an edit automaton, a node is first created for every element of the 
state set. The node representing the starting state is marked with a special edge into 
the node. If the state transition function maps (q, A) into (q\ A % an edge from the 
node of q to the node of q ' is added, and the edge is marked with Al A '. For 
conciseness, A is used to serve as a shorthand of A/ A. If the state transition is trivial 
(performing no change to an input pair of state and action), that edge may be omitted. 
Conversely, if a diagram does not explicitly specify an edge from state q with action A, 
there is an implicit edge with A/A from the node of q to itself. 
[0087] Figure 10 presents a policy for restricting the number of pop-up 

windows. The start state is popO 1000. State transition on (popO, close) is trivial 
(implicit). State transitions from the states with actions other than open and close are 
also trivial (implicit). This policy essentially ignores new window opening actions 
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when there are already two pop-ups, which is shown with the arrow 1001 that returns 
to state 1002. 

[0088] Figure 1 1 presents a policy for restricting the (potential) transmission of 

cookie information. The start state 1 101 is send-to-any. In state send-to-origin 1 102, 
network requests are handled with a safe version of the loading action called safe- 
loadURL. In this policy, state transitions on (send-to-any, loadURL(t)\ {send-to-any, 
safe-loadURL), (send-to-origin, readCookie), (send-to-origin, safe-loadURL) are 
trivial (implicit). State transitions from the states with actions other than reading, 
loading, and safe loading are also trivial (implicit). Essentially, this policy puts no 
restriction on loading before the cookie is read, but permits only safe loading 
afterwards. 

[0089] The implementation of the safe loading safe-loadURL performs 

necessary checks on the domain of the URL and asks the user whether to proceed with 
the loading if the domain of the URL does not match the origin of the document. If 
desirable, in one embodiment, a replacement action such as safe-loadURL obtains 
information from the current state of the automaton and performs specialized security 
checks and user prompts. Its implementation is part of the policy module and, 
therefore, does not affect the rewriting process. It suffices to understand the 
implementation of safe actions as trusted and cannot be circumvented — safe actions 
are implemented correctly, and malicious script cannot overwrite the implementation. 
[0090] In practice, there are many different kinds of attacks. In one 

embodiment, there are many different policies, each protecting against one kind of 
attack. In one embodiment, multiple (without loss of generality, two) policies are 
combined into one, which in turn guides the rewriting process. 

[0091] For a policy combination (111 © n 2 = n) to be meaningful, it is sensible 

to require the following two conditions. 

1 . Safe combination: Suppose II i and Yl 2 are consistent. For all A , Tli © n 2 
\-A if and only if n, \-A a.ndYl 2 \-A. 

2. Consistent combination: If n t and U z are consistent, then n ( ® U 2 is 
consistent. 
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[0092] A definition of policy combination that respects these requirements is 

as follows: 

Given two edit automata ri| = ({p, 1 1 = 0 . . .«}, po, 5i) and n 2 ({qy\j = 0 . . .m}, q 0 , 02), 
then: 

n, © n 2 = ({p/9y 1 1 = 0 — n,j = 0 . . .m},poq 0 , 5) 

Up,q k ,A') if 8,{p l ,A) = (p„A>)and 5 2 {q;,A')=(q k ,A') 
where s{p l q J , a) = i (j>,q k , A') else if S 2 (qj ,A)=(q k ,A') and <5, {p, , A') = (p, , A') 

\{Pi4 j * e ) otherwise 

[0093] Intuitively, in one embodiment, the combined policy simulates both 

component policies at the same time. When the first policy suggests an action that is 
agreed to by the second policy, the combined policy takes that action. If not, the first 
policy tries to see if the suggestion of the second policy is agreed to by the first policy. 
In the worse case that neither of the above two holds, the combined policy suppresses 
the action. There is a combinatorial growth in the number of states after the 
combination. In one embodiment, this does not pose a problem for an implementation, 
because a policy module maintains separate state variables and transition functions for 
the component policies, yielding a linear growth in the policy representation. 
[0094] It is not difficult to check that this definition of combination satisfies 

the above safety and consistency requirements. Nonetheless, note that there exist other 
sensible definitions of combination that also satisfy the same requirements. For 
example, the above definition "prefers" the first policy over the second. A similar 
definition that prefers the second is also sensible. Furthermore, a more sophisticated 
combination may attempt to resolve conflicts by recursively feeding suggested actions 
into the automata, whereas the above simply gives up after the first try. Note that, in 
one embodiment, the requirement of "safe combination" only talks about acceptable 
action sequences, not about replacement actions. 

CoreScript Instrumentation 

[0095] Given the policy module and its interface described above, the 

instrumentation of CoreScript becomes a syntax-directed rewriting process. 
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[0096] The task of the rewriting process is to traverse the document tree and 

redirect all actions through the policy module. Whenever an action act(A) is 
identified, the action is redirected to the action interface check(/i), trusting the policy 
module to perform an appropriate replacement action at "run-time". Upon receiving a 
higher-order script write(ii), the document is fed argument E verbatim to a special 
interface instr(£), whose implementation calls back to the rewriting process at "run- 
time" after E is evaluated. 

[0097] In one embodiment, the above two interfaces are organized as two new 

CoreScript instructions for the rewriting process to use. In particular, the syntax of 
CoreScript is extended as follows. 

(Script) P ::=... | instrtE) | check(^) 
[0098] Figure 1 2 illustrates the details of the rewriting process. In this process, 

no knowledge is required on the meaning or the implementation of the two new 
instructions. The non-trivial tasks are performed by Rules (19) and (20), where the 
new instructions are used to replace "run-time" code generation and actions. All other 
rules propagate the rewriting results. The rewriting cases for the two new instructions 
are given in Rule (24), which allows the rewriting to work on code that calls the two 
interfaces. The rewriting on world Wand its four components are also defined. In one 
implementation, some components (e.g., the document bank Z) will be instrumented 
on demand (e.g., when loaded). 

[0099] The semantics of the two new instructions are given so as to reason 

about the correctness of the instrumentation. For instr(£), the purpose is to mark 
script generation and delay the instrumentation until "run-time". Therefore, its 
operational semantics evaluate the argument expression and feed it through rewriting. 
The following definitions capture that. 
focus(js instr (£)) = instr (E) 

stepDocfjs instr (£), x) = where x «- E UZ> (33) 
step (instr (£), h (T., X , B, Q) = (X, x , adv(B, h, *),Q 

[00100] Recall that adv is defined in Figure 7. The operational semantics rules 

for other language constructs remain the same under the addition of instr. The focus 
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and step function cases defined above fit in well with Rule (14), which makes a step on 
a document given a specific window handle. 

[00101] Inspecting Rule (33), the rewriting process i is called at run time after 

evaluating E to D. In one embodiment, execution of i always terminates, producing an 
instrumented document. In this instrumented document, there is potentially further 
hidden script marked by further instr. Such hidden script will be rewritten later 
when it is generated. 

[00102J The semantics of <±\&ck(A) are defined in a similar fashion using the 

following definitions. 
focusijs check (/I)) = check (A) 

stepDoc(js check (A), x)=e (34) 
step (check (A), h (S, x. B, C)) = undefined 

[00103] The focus case for check^) is trivially check(A) itself. The execution 

of check(y4) consumes check(/l) entirely and leaves no further document piece for the 
next step, hence being the stepDoc case. The step case is undefined, because we will 
not refer to this case in the updated operational semantics. 

[00104] With the addition of check, the program execution is connected to the 

policy module. Therefore, in the updated operational semantics, the internal state of 
the policy module (the state of the edit automaton) is taken into account. In one 
embodiment, the reduction relations of CoreScript in Figure 13 are extended, where 
the new formations of the reduction relations explicitly specify the automaton 
transition function (5) and the automaton states (q and q*). Similar to the previous 
semantics, the multi-step relation defined by Rules (35) and (36) is a reflexive and 
transitive closure of a non-deterministic step relation defined by Rule (37). This non- 
deterministic step relation is defined with help of a deterministic step relation, which is 
referred to herein as "document advance." 

[00105] Document advance is defined by Rules (38) and (39). When the focus 

of the document is not a call to check, the old document advance relation (defined in 
Rule (14)) is used, and the automaton state remains unchanged. When the focus is a 
call to check, the automaton state is updated and the replacement action is produced 
according to the transition function, and the world components are updated using the 
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step case of act^ 1 ) because the replacement action A v is performed instead of the 
original action^. 

[00106] Thus, a policy instance is executed alongside with the program 

execution — the current state of the policy instance is updated in correspondence with 
the actions of the program. 

Concretizing CoreScript 

[00107] In one embodiment, CoreScript is modeled as a core language for 

client-side scripting. Its distinguishing features include the embedding of script in 
documents, the generation of new script at "run-time", and the security-relevant 
actions. The ideas described above are also applicable to other browser-based 
scripting languages. 

[00108] First, CoreScript supports the embedding of code in a document tree 

using j s nodes. Such a treatment is adapted from the use of <script> tags in 
JavaScript (Figure 1 provided an example). Beyond the <script> tags, there are 
many other ways for embedding script in an HTML document. Some common places 
where script could occur include images (e.g., <IMG SRC=...>), frames (e.g., <IFRAME 
SRC=...>), tables (e.g., <TABLE BACKGROUND^.. >), XML (e.g., <XML SRC=...>, and 
body background (e.g., <BODY BACKGROUND^.. >. Furthermore, script can also be 
embedded in a large number of event handlers (e.g., onActivate(), onClick(), 
onLoadO, onUnload(), . . . ). In one embodiment, such embedded script is also 
identified and rewritten. 

[00109] Second, CoreScript makes use of write(£) to generate script at "run- 

time". This is a unified view on several related functions, including eval in the 
JavaScript core language and window . execScript, document . write, 
document . writeln in the DOM. These functions all take string arguments. The 
function eval evaluates a string as a JavaScript statement or expression and returns the 
result. The function window . execScript executes one or more script statements but 
returns no values. CoreScript' s treatment on higher-order script is expressive enough 
for these two. 
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[001 10] However, the functions document . write and document . writeln are 

more challenging. These two functions send strings as document fragments to be 
displayed in their windows, where the document fragments could have script 
embedded. These document fragments do not have to be complete document tree 
nodes, and instead, they can be pieced together with other strings to form a complete 
node, as demonstrated in the following examples. 
<script> 

document . write ( "<scr" ) ; 
document . write ( "ipt> malic" ) ; 
var i = 1; 

document. write ("ious code; </sc"); 
document . write ( "ript>" ) ; 
</script> 

<script> 

document . write ("<scr") ; </script>ipt> 

malicious code 

</script> 

[00111] Each of the above write functions appears to produce harmless text to 

a naive filter. To avoid such loopholes when applying CoreScript instrumentation, in 
one embodiment, generated document fragments are pieced together before fed into 
the rewriting process of the next stage. This is done with care to avoid changing the 
semantics of the code (recall Figure 2). Observing that the expressiveness of 
producing new script as broken-up fragments does not seem to be useful in well- 
intended programs, a better solution might be to simply disrupt the generation of 
ungrammatical script pieces. 

[00112] In one embodiment, CoreScript does not provide a way to modify the 

content of a document in arbitrary ways, because a writeCE) node generates a new 
node to be positioned at the exact same location in the document tree. The DOM 
provides other ways for modifying a document. For instance, a document could be 
modified through the innerHTML, innerText, outerHTML, outerText, and 
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nodeValue properties of any element. In one embodiment, these are not covered in 
the CoreScript model. Nonetheless, an extension is conceivable, where the mechanism 
for "run-time" script generation specifies which node in the document tree is to be 
updated. The instrumentation method remains the same, because it does not matter 
where the generated script is located, as long as it is rewritten appropriately to go 
through the policy interface. 

[00113] Lastly, in one embodiment, CoreScript includes some simple actions for 

demonstration purposes. A realization would accommodate many other actions 
pertinent to attacks and protections. Some relevant DOM APIs include those for 
manipulating cookies, windows, network usage, clipboard, and user interface 
elements. In addition, it is useful to introduce implicit actions for some event handlers. 
For instance, the "undead window" attack below could be prevented by disallowing 
window opening inside an onUnload event. 

<html> 

<head> 

<script type="text/javascript"> 

function respawn() { window. open ( "URL/undead . html") } 

</script> 

</head> 

<body onunload=" re spawn ( ) "XTontent of undead. html< /body > 
</html> 

An Example Implementation Architecture 

[00114] Figure 14 illustrates an example of an implementation architecture. 

Each of the modules may comprise hardware (circuitry, dedicated logic, etc.), software 
(such as is run on a general purpose computer system or a dedicated machine), or a 
combination of both. 

[00115] Referring to Figure 14, the implementation may extend a browser with 

three small modules — module 1401 for the syntactic code rewriting (i), module 1402 
for interpreting the special instruction (instr), and module 1404 for implementing the 
security policy (11). In one embodiment, a browser 1403 does not interpret a document 
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D directly. Instead, browser 1403 interprets a rewritten version i (£>) produced by the 
rewriting module. Upon a special instruction instr(£), the implementation of instr 
evaluates the expression E and sends the result document D' through rewriting module 
1401. The result of the rewriting i (£)*) is directed back to browser 1403 for further 
interpretation. Upon a call to the policy interface check(v4), policy module 1404 
advances the state of the automaton and provides a replacement action A '. 
[00116] In one embodiment, rewriting module 1401 is implemented in Java, 

with help of ANTLR for parsing JavaScript code. For more information on ANTLR, 
see T. Parr et al. ANTLR reference manual (available at http://www.antlr.org/, Jan. 
2005). This module parses HTML documents into abstract syntax trees (ASTs), 
performs rewriting on the ASTs, and generates instrumented JavaScript code and 
HTML documents from the ASTs. In one embodiment, browser 1403 is set up to use 
rewriting module 1401 as a proxy for all HTTP requests. 

[00117] In one embodiment, the special instruction can be implemented by 

modifying the JavaScript interpreter in browser 1403 according to the operational 
semantics given by Rule (33) given above. After the interpreter parses a document 
piece out of the string argument (abstracted by the evaluation relation in Rule (33)), 
rewriting module 1401 is called to perform rewriting of script. The interpretation 
resumes afterwards with the rewritten document piece. 

[00118] Although the above is straightforward, it requires changing the 

implementation of the browser. Alternatively, one may opt for an implementation 
within the regular JavaScript language itself, where instr is implemented as a 
JavaScript function. The call-by-value nature of JavaScript functions evaluates the 
argument expression before executing the function body, which naturally provides the 
expected semantics. For example, in one embodiment, an XMLHttpRequest object 
(A. van Kesteren and D. Jackson. The XMLHttpRequest object. W3C working 
draft, available at http://www.w3.org/TR/XMLHttpRequest/, 2006.) (popularly known 
as part of the Ajax approach) is used to call the Java program of the rewriting module 
from inside JavaScript code. 

[00119] Although convenient, this approach is not as robust as that of modifying 

the JavaScript interpreter, because it is more vulnerable to malicious exploits. As 
discussed above, JavaScript provides some form of self-modifying code, e.g., through 
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inner HTML. This presents a possibility for malicious script to overwrite the 
implementation of instr, if instr is implemented in JavaScript and interpreted 
together with incoming documents. Additional code inspection is needed to protect 
against such exploits, which makes the implementation dependent on some 
idiosyncrasies of the JavaScript language. Therefore, it may be more desirable to 
modify the interpreter when facing a different tradeoff. 

[00120] Similar implementation choices apply to the policy module. For 

example, one can implement the policy module as an add-on to the browser with the 
expected policy interface. In one embodiment, the policy module is also implemented 
in regular JavaScript — check is implemented as a regular JavaScript function and calls 
to check are regular function calls with properly encoded arguments that reflect the 
actions being inspected. The body of the check function performs the replacement 
actions, which are typically the original actions with checked arguments and/or 
inserted user prompts. The above protection for instr against malicious exploits 
through self-modifying code also applies here. 

[00121] In one embodiment, policies are enforced per "document cluster." A 

browser may simultaneously hold multiple windows, and some of these windows 
communicate with each other (e.g., a window and its pop-up, if the pop-up holds a 
document from the same origin); these are referred to herein as being in the same 
cluster. In one embodiment, each cluster is given its own policy instance in the form 
of a JavaScript object, and all windows in the same cluster are given a reference to the 
cluster's policy instance, which is properly set up when windows are created or 
documents are loaded. This does not affect the essence of the instrumentation. 
Nonetheless, the per-cluster enforcement is necessary for expressing practical policies. 
On the one hand, in one embodiment, documents from different clusters do not share 
the same policy instance, so that the behavior of one document would not affect what 
an unrelated document is allowed to do (e.g., two unrelated windows may each have 
their own quota of pop-ups). On the other hand, documents from the same cluster 
share the same policy instance to prevent malicious exploits (e.g., an attack may 
conduct relevant actions in separate documents in the same cluster). 

Correctness off the Method 
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[00122] For purposes herein, the correctness of the instrumentation is presented 

as two theorems — safety and transparency. Safety states that instrumented code will 
respect the policy. Transparency states that the instrumentation will not affect the 
behavior of code that already respects the policy. The intuition behind these 
correctness theorems is straightforward, since the instrumentation described herein 
feeds all actions through the policy module for suggestions. Safety holds because the 
suggested actions satisfy the policy due to policy consistency. Transparency holds 
because the suggested actions would be identical to the input actions if the input 
actions already satisfy the policy. In what follows, these two theorems are established 
with a sequence of lemmas. 

[00123] First, a notion of orthodoxy is introduced. 

Definition 2 (Orthodoxy) W(pr X> B, C, D, P) is orthodox if it has no occurrence 
of act(A) or writeOE). 

[00124] Note that, in one embodiment, the instrumentation described herein 

produces orthodox results, as in the following lemma. 

Lemma 1 (Instrumentation Orthodoxy) </>), *(£>), <Q, i(B) y i(x), and t(W) are 
orthodox. 

Proof sketch: By simultaneous induction on the structures of P and D. By case 
analysis on the structures of C, B, x, 2, and W. 

[00125] The orthodoxy is preserved by the step relation as shown as follows. 

Lemma 2 (Orthodoxy Preservation) If Wis orthodox and 
*-s(W,q)^(W,g r ) : A\ then W is orthodox. 

Proof sketch: By definition of the step relation (-»), with induction on the structure of 
documents. The case of executing write(£) is not possible because Wis orthodox. 
In the case of executing inst r(£), the operational semantics produces an 
instrumented document to replace the focus node. Orthodoxy thus follows from 
Lemma 1 . In all other cases, the operational semantics may obtain document pieces 
from other program components, which are orthodox by assumption. 
[00126] The execution of an orthodox world respects the policy, as articulated 

below. 
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Lemma 3 (Policy Satisfaction) Suppose n = (Q, q, 5) is consistent. If W is orthodox 
and I- s (W,q)-+ (W, q 1 ) : A\ then S(q, A") = {q A*). 

Proof sketch: By case analysis on the step relation {-*). In the case of executing 
check(y4), by inversion on Rule (39), 5(g, A]) = (q', Ay The expected result S(q, A") = 
(g', A v ) follows directly from the definition of policy consistency. In all other cases, by 
inversion on Rule (38), q = q'. By further inversion on Rule (14), A v = e (the case of 
executing a ct (A) is not possible because W is orthodox). S(q, e ) = (q, <= ) because of 
the deterministic requirement on policies. 

[00127] The safety theorem follows naturally from these lemmas. 

Theorem 2 (Safety) Suppose n = (Q, q, S) is consistent. If Wis orthodox and i- 6 (W, 

q) -** {W\ q 1 ) : A', then n \-A". 

Proof sketch: By structural induction on the multi-step relation {-**). The base case of 
zero step and empty output action is trivial. In the inductive case, there exists W\, q\ 
and A] such that h s (W, q) -* (W x , q x ) : A), t- s (W u q{) (W, q') and A v ' , and A v 

= AJ A* 1 . By Lemma 3, d (q, A") = (q u A*). (Q, q u 5) is consistent by assumption and 
definition of policy consistency. W\ is orthodox by Lemma 2. By induction 
hypothesis, {Q, q \ , 5) h A"' . By definition of policy satisfaction, U\-A v . 
[00128] From the instrumentation's perspective, it is desirable to establish that 

i(W) is safe given any W. This follows as a corollary of Theorem 2, because K.W) is 
orthodox by Lemma 1 . 

[00129] To formulate the transparency theorem, the multi-step relation defined 

above is used before the instrumentation extension. This reflects the intuition that 
incoming script should be a sensible CoreScript (or JavaScript) program without 
knowledge about the policy module. A lock step lemma is introduced to relate the 
single-step execution of instrumented code with the single-step execution of the 
original code in the case where the original code satisfies the policy. 
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Lemma 4 (Lock step) If W-+ W : A v and 6 (q, A") - (q\ A"), then t- MW), 4) ~* 
q')'.A v . 

Proof sketch: By definition of the step relation (-*), with induction on the structure of 
documents. The focus of t(W) refers to a tree node in correspondence with the focus 
of W. 

[00130] In the case that write(£) is the focus of W, instr(£) will be the 

focus of / ( W) . The operational semantics ofwrite and instr perform a similar 

evaluation on the argument E, except that instr(£) uses an instrumented variable 
environment and returns an instrumented result document. The output action A v is e 
in both cases. We can construct the derivation i- 6 (i(W), q) -+ (t ( W), q') : A v by: (i) 

following Rule (37) and choosing the same handle h as used for obtaining W~* W : A v ; 
(ii) following Rule (38), which refers back to the old single-step relation h K t(W) ~*t 
(W) : A v \ then (iii) following the derivation of h h W~*W : A v used for obtaining 
W-*W : A v , with various components replaced with the instrumented version. 
[00131] In the case that act(^) is the focus of W, chec k(A) will be the focus 

of l{W). act and check both produce an empty string to replace the focus tree node. 
The operational semantics of act(^4) evaluate A to A v (Rule (14)). The operational 
semantics of check^) evaluate A to A v and feed A v to the policy (Rule (39)). By 
assumption, S (q, A") = (q', A*). Therefore, in one embodiment, act and check 
produce the same output action in this case. The operational semantics of check(/l) 
will further apply the macro step to actM") to update the world components. 
Therefore, further derivations of the two reductions follow the same structure. 
[00132] In all other cases, W and l(fV) are executing the same instructions. The 

derivation of the instrumented reduction follows that of the original reduction. The 
transparency theorem follows naturally from the lock step lemma. 
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Theorem 3 (Transparency) If W ^* W: A" and (g, q, 8) \- A\ then l- S (l(W), q) 

Proof sketch: By structural induction on the multi-step relation (->*). The base case of 
zero step and empty output action is trivial. In the inductive case, there exists W\ and 
A] such that W-+ W x : A], W^* W : A v ' , and A v = A° A v ' . By assumption (Q, q, S) 

t-A* and definition of policy satisfaction, there exists qi such that 5 {q, A") = (qi, A]} 
and (Q, q u 5) \- A"' . By Lemma 4, h s q) ~< qi) ■ A]. By induction 

hypothesis, h s qi) -+*(K.W% q 1 ) : A v ' . By Rule (36), I- s (l(W), q) -*«»"), l') ■ 

A". 

[00133] In the above transparency theorem, the original world Wdoes not refer 

to the instrumentation and policy interfaces, reflecting that incoming script is written 
in regular JavaScript. A variant of the transparency theorem can be formulated to 
allow incoming script that refers to the instrumentation and policy interfaces, as 
follows. 

Theorem 4 (Extended Transparency) If i- 5 {W, q) ^*{W, q') : A" and (Q, q, S) v A\ 
then h^W, q) q') : A\ 

This theorem allows W to be unorthodox — W may contain a mixture of write , 
act, instr and check. The proof of this theorem requires a similarly extended 
lockstep lemma. The proof extension is straightforward, because on the two new cases 
allowed by this theorem (instr and check), the rewriting is essentially an identity 
function. 

An Example of a Computer System 

[00134] Figure 1 5 is a block diagram of an exemplary computer system that 

may perform one or more of the operations described herein. Referring to Figure 1 5, 
computer system 1 500 may comprise an exemplary client or server computer system. 
Computer system 1500 comprises a communication mechanism or bus 1511 for 
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communicating information, and a processor 1512 coupled with bus 1511 for 
processing information. Processor 1512 includes a microprocessor, but is not limited 
to a microprocessor, such as, for example, Pentium™, PowerPC™, Alpha™, etc. 
[00135] System 1 500 further comprises a random access memory (RAM), or 

other dynamic storage device 1504 (referred to as main memory) coupled to bus 1511 
for storing information and instructions to be executed by processor 1512. Main 
memory 1504 also may be used for storing temporary variables or other intermediate 
information during execution of instructions by processor 1512. 
[00136] Computer system 1 500 also comprises a read only memory (ROM) 

and/or other static storage device 1506 coupled to bus 1511 for storing static 
information and instructions for processor 1512, and a data storage device 1507, such 
as a magnetic disk or optical disk and its corresponding disk drive. Data storage 
device 1507 is coupled to bus 1511 for storing information and instructions. 
[00137] Computer system 1 500 may further be coupled to a display device 

1521 , such as a cathode ray tube (CRT) or liquid crystal display (LCD), coupled to bus 
1511 for displaying information to a computer user. An alphanumeric input device 

1 522, including alphanumeric and other keys, may also be coupled to bus 1511 for 
communicating information and command selections to processor 1512. An additional 
user input device is cursor control 1523, such as a mouse, trackball, trackpad, stylus, or 
cursor direction keys, coupled to bus 1511 for communicating direction information 
and command selections to processor 1512, and for controlling cursor movement on 
display 1521. 

[00138] Another device that may be coupled to bus 1511 is hard copy device 

1 524, which may be used for marking information on a medium such as paper, film, or 
similar types of media. Another device that may be coupled to bus 1511 is a 
wired/wireless communication capability 1 525 to communication to a phone or 
handheld palm device. 

[00139] Note that any or all of the components of system 1500 and associated 

hardware may be used in the present invention. However, it can be appreciated that 
other configurations of the computer system may include some or all of the devices. 
[00140] Whereas many alterations and modifications of the present invention 

will no doubt become apparent to a person of ordinary skill in the art after having read 
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the foregoing description, it is to be understood that any particular embodiment shown 
and described by way of illustration is in no way intended to be considered limiting. 
Therefore, references to details of various embodiments are not intended to limit the 
scope of the claims which in themselves recite only those features regarded as essential 
to the invention. 



31 



WO 2008/002456 



PCT/US2007/014506 



CLAIMS 

We claim: 

1 . A method comprising: 

downloading a document with a script program embedded therein, wherein the 
script program comprises self-modifying code; 
inspecting the script program; and 

rewriting the script program to cause behavior resulting from execution of the 
script to conform to one or more policies defining safety and security. 

2. A proxy comprising: 

a policy management module to implement a security policy; 

a rewriting module to perform a rewriting process to rewrite a script embedded 
in a document based on the security policy, wherein the script program comprises self- 
modifying code, wherein the rewriting process instruments the document based on one 
or more policies to control the script in the document so that behavior resulting from 
execution of the script conforms to safety and security requirements; 

an interpretation module to interpret instructions added to the scripts during 
rewriting. 

3 . An article of manufacturing having one or more machine-readable 
media storing instructions which, when executed by a machine, cause the machine to: 

download a document with a script program embedded therein, wherein the 
script program comprises self-modifying code; 
inspect the script program; and 

rewrite the script program to cause behavior resulting from execution of the 
script to conform to one or more policies defining safety and security. 
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<head> . . . </head> 
<body> 
<p> 

<script> 

var name=parseName (document . cookie) ; 
document .write ("Greetings, " + name + "!"); 

</script> 

Zt is <em>erreat</em> to see you! 

</p> 
</body> 



| head | | body | 



| script [ [""It^ia" | |~em [ [""to see you ! 

| "groat" | 



Figure 1 



Script fragment: 



.write("<acript> i = 2 ,- document . write < i ) ; </scr" -► "ipt>" + i > ; 
Output: 21 

Script fragment: 



Figure 2 
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Download a document with a script program 
embedded therein 

301 

1 

Inspect the script program in the document 
302 



Rewrite the script program to cause behavior 
resulting from execution of the script to conform to 
one or more policies defining safety and security 
303 



Figure 3A 
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A. 



3E] - □ 



f^pl ... H E 

aafa-load(url); 



(at load tlmo) script! 



mata-loadlarX) ; 

doc .obfuscated s 
lnrtr(doc) ; 



A. 



3E] - EZE 



instrl-Joadfurj;") 



Figure 3B 



(World) 


W 


:= (r iX) B,C) 


(Document Bank) 


£ 


■-= {[I = »)'} 


(Variable Bank) 


X 


.= {[* =D]\[f = (Z)PY] 


(Browser) 


B 


:={[/. = D € d]'} 


(Cookie Bank) 


c 


:={[d = D)'} 


(URL) 


I 


:=d.p 


(Domain) 


d 




(Path) 


p 




(Handle) 


h 




(Document) 


D 


:= string \jsP\FD 


(Value Document) 


D v 


:= string j F Z>" 


(Format) 


F 


:= jux | p | em | b | i | HI | h2 | . . . 


(Script) 


P 


:= skip | x = £ | P, P | if £ then P else P 
| while E do P | f(E) | act(/4) | write(S) 


(Expression) 


E 


:= x | £> | op(E) 


(Action) 


A 


:= e | newWin(x, E) | closeWin(£;) | loadURL(f7) 
| readCki(x) | writeCki(£0 | secOp(F) 


(Value Action) 


A° 


:= e | newWin(_, £>) | clo 3 eWin(£>) | loadURL(D) 
t readCki(_) | writeCki(£?) | secOp(D) 



Figure 4 
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X t~ op(E) * op(D) 



X H newWin(i. B) 4 newWin(_. D) 

xy-B^p 

X h cloaeWin(£) 4 closeWin(jD) 

Xl"g|D 

X H loadURL(-E) 4 loadURL(£>) 

X I- roadCki(i) ^ readCki(-) 

yl-g^D 

X I" writeCki(£) 4 writeCki(ZJ) 

Xl-g^D 

X h secOp(J) 4 secOp(D) 

Figure 5 



W — * W : 



W = (£, x. S. C) B = {hi = £\ e d^-O-") 
Pick any j : h W -~» IV' : A v 



\h\- W — W : A"\ 



B(h) = D ed focus(D) = P 

step(P, li, (£, x. B, <7)) = W 

a* - f A i if P= act(i4) and x t" >4 4 A\ 

~ { e ifP^act(A) 

/ih (E,x, B,C)-^W:A V 



CD 
(2) 



EEHZ] _— (4) 

Xl-e^e 



(5) 

(6) 

(7) 
(8) 

(9) 

(10) 

(11) 
(12) 



Figure 6 
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Jocus(D) = P 
atepDoc(D, X ) = D' 



lfD = 


then jocus(D) = 


and atepDoc(D, x) = 


ja P where P 6 {skip, x = E, 
act(/t)} 


P 


e (empty string) 


js urite(S) 


write(£) 


D where 


js Pi , Pi 


/ocixsOs A) 


jux D (js Pt) 

where D = stepDocfjs Pi. x) 


ja if E then. Pi else P 2 


if E then P» else P 2 


js Pi if x I" E -U- 
jsP 2 if x H i?JJ. false 


js while -E do P 


while BdoF 


js if Z? then (P; while E do P) 
else skip 


js HE) 


/(E) 


js P[D/x\ where * h £ 4 # 
and X (/)=(x)P 


F D V D'D 

where D' is not a value document 




F D»D"D 

where D" = stepDoc(D', x) 



step(P,h, W) = W'\ 


IfP = 


the 


t step{P,h, (£, 


*.2?.C)) = 






act(e) 


(£. 




adv(B, A, x). 


C) 




act(newWin(*,E)) 


(E. 


x{* = fc'>, 


*'{/»' = /? e rf}, 


C) 


where x E ^ rf.p 

B' = adv(B,h,x 
E(d.p) = £> 
/i' is fresh 


act(closeWin(£')) 


(E, 




B' - {/»'}, 


C) 


where x 1" E ii- h' 

B' — adv(B, h, x 


act(loadURL(.E)) 


(B. 


X. 


B{h = D 6 ri}. 


C) 


where x *~ E 4J. d.p 
S(d.p) = D 


act(readCki(:r)) 


(E. 


x{x = C(d)}, 


adv(B, ft,x). 


C) 


where B(/») = I>6rf 


act(writeCki(J5;)) 


(E. 


X, 


adv(B, h,x). 


C{d = O}) 


where x l~ E U- D 

B(/i) = Ded 


act(secOp(tf)) 




x = E 


(E. 


X{* = D\, 


adv(B, h. x). 


C) 


where x r- £ -D 


other P 


(E, 


X, 


adv(JS, /i,x), 


C) 





where adi/(fi, /», x) = B{h = stepDoc(D, x) e d} if B(h) = D e d 
Figure 7 
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</ e {accept state} 
(Q. «/,<$) He 
H<1, A) = (Q,q',6)hA 
(Q,q,S)hAA 



5{< h A) = {g',A') (Q,tf,8) \- A A' 
(Q, q, 5) h AA =j» A' A' 



Figure 8 



05) 
(16) 

(17) 
(18) 



Figure 9 



open open open/e 




Figure 10 



loadDRi ri; /safe-IoadDRL CU 




Figure 1 1 
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*(wrif (£)) = instr(£) (19> 
(20) 



t(act(/4)) = check(y») 



t(if E then P, else P 2 ) = if £ then i(Pi) else i(P 2 ) 



i(while E do P) = Hhile £ do t(P) 
P 6 {skip, ! = £,/(£?), instr(g), checkQ4)} 
,.(P) = P 



i(jsP)=jsi.(P) 



l'(x) = x / l 



t({(/ = i?r}) = {(/ = t (D))*} 



|t(fl) = fl'l 



<.({(/» -De d)'}) = {(ft = e d)-} 



(21) 
(22) 
(23) 

(24) 



l^) = p'l T(^y^^ (25> 

(26) 



(27) 
(28) 



(30) 



N C > = C 'l -£»]•}) = {(d-t(D)]-} (31> 

(32) 

i((E, x, = ( t (£), i(C)) ^ 

Figure 12 
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h-a (M'.qW* {W'.q') : A" 



*S {W, <?) ~ {W\ q') : .4" h s {W, q') ~' {W", q") : A" 
r-i {W.q) — * (W",q") : A»A« 



h- a (VK, V )- (W',q'):A* 



W = (£, B. C) B = {hi = £\ 6 *}*=<»■■■«> 
Pick any j : t-^ (VK, q) ( W, q') : A" 

\- S (W,q)<~(W'.q'):A» 



\h\- s {W, q )^{W>,q>):A«\ 



B(h) =Ded focus(D) ^ check(yl) 

h (- (£, Xt-B-C) ~~* W : A° 

h h s ((£, X , B, C), q) ~ ( W, q) : A" 

B(h) = D 6 d focus(D) = check(/4) 
X y-AUA? S(q,AX) = W,A") 
s<ep(act(/t"), h, (S, X . g, C)) = IV 
hh 6 ((E.x.fl.C),?)-^?'):^ 



Figure 13 
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DISPLAY 
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